explanatory feature
- North America > United States (0.14)
- Oceania > Australia > New South Wales (0.04)
- Asia > China > Beijing > Beijing (0.04)
- (4 more...)
- North America > United States (0.14)
- Oceania > Australia > New South Wales (0.04)
- Asia > China > Beijing > Beijing (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Data Science > Data Mining (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
The Contextual Lasso: Sparse Linear Models via Deep Neural Networks
Thompson, Ryan, Dezfouli, Amir, Kohn, Robert
Sparse linear models are one of several core tools for interpretable machine learning, a field of emerging importance as predictive models permeate decision-making in many domains. Unfortunately, sparse linear models are far less flexible as functions of their input features than black-box models like deep neural networks. With this capability gap in mind, we study a not-uncommon situation where the input features dichotomize into two groups: explanatory features, which are candidates for inclusion as variables in an interpretable model, and contextual features, which select from the candidate variables and determine their effects. This dichotomy leads us to the contextual lasso, a new statistical estimator that fits a sparse linear model to the explanatory features such that the sparsity pattern and coefficients vary as a function of the contextual features. The fitting process learns this function nonparametrically via a deep neural network. To attain sparse coefficients, we train the network with a novel lasso regularizer in the form of a projection layer that maps the network's output onto the space of $\ell_1$-constrained linear models. An extensive suite of experiments on real and synthetic data suggests that the learned models, which remain highly transparent, can be sparser than the regular lasso without sacrificing the predictive power of a standard deep neural network.
- North America > United States (0.14)
- Oceania > Australia > New South Wales (0.04)
- Asia > China > Beijing > Beijing (0.04)
- (4 more...)
Contextual Feature Selection with Conditional Stochastic Gates
Sristi, Ram Dyuthi, Lindenbaum, Ofir, Lavzin, Maria, Schiller, Jackie, Mishne, Gal, Benisty, Hadas
We study the problem of contextual feature selection, where the goal is to learn a predictive function while identifying subsets of informative features conditioned on specific contexts. Towards this goal, we generalize the recently proposed stochastic gates (STG) Yamada et al. [2020] by modeling the probabilistic gates as conditional Bernoulli variables whose parameters are predicted based on the contextual variables. Our new scheme, termed conditional-STG (c-STG), comprises two networks: a hypernetwork that establishes the mapping between contextual variables and probabilistic feature selection parameters and a prediction network that maps the selected feature to the response variable. Training the two networks simultaneously ensures the comprehensive incorporation of context and feature selection within a unified model. We provide a theoretical analysis to examine several properties of the proposed framework. Importantly, our model leads to improved flexibility and adaptability of feature selection and, therefore, can better capture the nuances and variations in the data. We apply c-STG to simulated and real-world datasets, including healthcare, housing, and neuroscience, and demonstrate that it effectively selects contextually meaningful features, thereby enhancing predictive performance and interpretability.
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > California > San Diego County > La Jolla (0.04)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
- (3 more...)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
- Health & Medicine > Therapeutic Area > Neurology (0.66)
Convex Fairness Constrained Model Using Causal Effect Estimators
Recent years have seen much research on fairness in machine learning. Here, mean difference (MD) or demographic parity is one of the most popular measures of fairness. However, MD quantifies not only discrimination but also explanatory bias which is the difference of outcomes justified by explanatory features. In this paper, we devise novel models, called FairCEEs, which remove discrimination while keeping explanatory bias. The models are based on estimators of causal effect utilizing propensity score analysis. We prove that FairCEEs with the squared loss theoretically outperform a naive MD constraint model. We provide an efficient algorithm for solving FairCEEs in regression and binary classification tasks. In our experiment on synthetic and real-world data in these two tasks, FairCEEs outperformed an existing model that considers explanatory bias in specific cases.
- Asia > Taiwan > Taiwan Province > Taipei (0.05)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (5 more...)
SPSA-FSR: Simultaneous Perturbation Stochastic Approximation for Feature Selection and Ranking
Yenice, Zeren D., Adhikari, Niranjan, Wong, Yong Kai, Aksakalli, Vural, Gumus, Alev Taskin, Abbasi, Babak
This manuscript presents the following: (1) an improved version of the Binary Simultaneous Perturbation Stochastic Approximation (SPSA) Method for feature selection in machine learning (Aksakalli and Malekipirbazari, Pattern Recognition Letters, Vol. 75, 2016) based on non-monotone iteration gains computed via the Barzilai and Borwein (BB) method, (2) its adaptation for feature ranking, and (3) comparison against popular methods on public benchmark datasets. The improved method, which we call SPSA-FSR, dramatically reduces the number of iterations required for convergence without impacting solution quality. SPSA-FSR can be used for feature ranking and feature selection both for classification and regression problems. After a review of the current state-of-the-art, we discuss our improvements in detail and present three sets of computational experiments: (1) comparison of SPSA-FS as a (wrapper) feature selection method against sequential methods as well as genetic algorithms, (2) comparison of SPSA-FS as a feature ranking method in a classification setting against random forest importance, chi-squared, and information main methods, and (3) comparison of SPSA-FS as a feature ranking method in a regression setting against minimum redundancy maximum relevance (MRMR), RELIEF, and linear correlation methods. The number of features in the datasets we use range from a few dozens to a few thousands. Our results indicate that SPSA-FS converges to a good feature set in no more than 100 iterations and therefore it is quite fast for a wrapper method. SPSA-FS also outperforms popular feature selection as well as feature ranking methods in majority of test cases, sometimes by a large margin, and it stands as a promising new feature selection and ranking method.
- Europe > Austria > Vienna (0.14)
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > United States > Wisconsin (0.04)
- (6 more...)
Semantics-Empowered Big Data Processing with Applications
Thirunarayan, Krishnaprasad (Kno.e.sis: Ohio Center of Excellence in Knowledge-Enabled Computing) | Sheth, Amit (Kno.e.sis: Ohio Center of Excellence in Knowledge-Enabled Computing)
We discuss the nature of big data and address the role of semantics in analyzing and processing big data that arises in the context of physical-cyber-social systems. To handle volume, we advocate semantic perception that can convert low-level observational data to higher-level abstractions more suitable for decision-making. To handle variety, we resort to semantic models and annotations of data so that intelligent processing can be done independent of heterogeneity of data formats and media. To handle velocity, we seek to use continuous semantics capability to dynamically create event or situation specific models and recognize relevant new concepts, entities and facts. To handle veracity, we explore trust models and approaches to glean trustworthiness. These four v's of big data are harnessed by the semantics-empowered analytics to derive value to support applications transcending physical-cyber-social continuum.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
- (14 more...)
- Transportation (1.00)
- Government > Regional Government (0.93)
- Information Technology > Services (0.93)
- (2 more...)